IN THE CLAIMS:



Please amend claims 11, 12, 14, and 15, cancel claims 8 and 10, and add new 
claims 16 and 17 as shown in this complete set of all pending claims: 

1. (Previously Presented) A method of detecting speech in an incoming signal 
comprising the steps of: 

receiving said incoming signal, extracting an estimate of the noise background of 
the incoming signal and suppressing the noise background of the 
incoming signal to provide a noise suppressed signal in which the 
estimated background noise has been removed, filtering the noise 
suppressed signal in which the background noise has been removed with 
a spectral inverse filter, said spectral inverse filter is determined by 
spectrum maxima and the inverse filtering operation comprising the steps 
of: 

in the logarithmic (dB) domain, removing the mean spectral magnitude from the
original speech spectrum,

in the mean removed short term frequency spectrum S(i), (i=1...128),
determining all the frequency position (Pj), whose magnitudes are maxima
over a window centered around Pj and stretching N positions to the left
and right of Pj,

in the list of peaks, adding the first (i=1) and last (i=128) frequency positions,
their associated magnitudes set equal to the mean of the first and last M x
N magnitudes, respectively, wherein said M and N are preset constants,

removing the mean of the peak magnitudes from each peak magnitude, 

if the largest resulting peak magnitude exceeds a predetermined maximum peak 
value MAX_dB_DN, normalizing all peaks so that the largest peaks 
magnitude becomes MAX_dB_DN, and 

the resulting inverse filtering H(i), (i=1...128) is defined as the maximum of the
normalized peaks and 0 dB, and



TI-35988 - 2 



removing the inverse filter from the original spectrum in the logarithmic domain 
U(i) = S(i) - H(i) and measuring the periodicity of the signal from the 
inverse filter using an autocorrelation function to determine whether a 
signal frame corresponds to a speech frame or not. 
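
For illustration only, and not as part of the claim language, the inverse-filter steps recited in claim 1 might be sketched as follows. Several details are assumptions the claim does not state: the function name `inverse_filter` is hypothetical, the peak-search window is clipped at the spectrum edges, "normalizing" is taken to mean multiplicative scaling, and H(i) between peak positions is obtained by linear interpolation before the comparison with 0 dB.

```python
from statistics import mean

def inverse_filter(spectrum_db, M=5, N=3, max_db_dn=3.5):
    """Return (S, H, U) for one frame of log-magnitude (dB) spectrum values."""
    n = len(spectrum_db)

    # Remove the mean spectral magnitude in the dB domain.
    mu = mean(spectrum_db)
    S = [v - mu for v in spectrum_db]

    # Interior positions whose magnitude is the maximum over a window of
    # N positions to each side (window clipped at the edges: an assumption).
    peaks = {}
    for p in range(1, n - 1):
        lo, hi = max(0, p - N), min(n, p + N + 1)
        if S[p] == max(S[lo:hi]):
            peaks[p] = S[p]

    # Add the first and last positions, with magnitudes set to the mean of
    # the first and last M x N magnitudes.
    peaks[0] = mean(S[:M * N])
    peaks[n - 1] = mean(S[-M * N:])

    # Remove the mean of the peak magnitudes from each peak magnitude.
    peak_mean = mean(list(peaks.values()))
    peaks = {p: v - peak_mean for p, v in peaks.items()}

    # If the largest peak exceeds MAX_dB_DN, scale all peaks so that the
    # largest becomes MAX_dB_DN (multiplicative scaling: an assumption).
    largest = max(peaks.values())
    if largest > max_db_dn:
        scale = max_db_dn / largest
        peaks = {p: v * scale for p, v in peaks.items()}

    # H(i): maximum of the peak envelope and 0 dB. Linear interpolation
    # between neighbouring peaks is an assumption, not recited in the claim.
    positions = sorted(peaks)
    H = [0.0] * n
    for left, right in zip(positions, positions[1:]):
        for i in range(left, right + 1):
            t = (i - left) / (right - left)
            H[i] = max(0.0, peaks[left] + t * (peaks[right] - peaks[left]))

    # Remove the inverse filter in the logarithmic domain: U(i) = S(i) - H(i).
    U = [s - h for s, h in zip(S, H)]
    return S, H, U
```

The defaults M=5, N=3 and MAX_dB_DN=3.5 dB are the values recited in claim 13; the sketch assumes the frame holds at least M x N values.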

2. (Original) The method of claim 1 wherein said periodicity measurement is defined 
as: 

$p = \max_{T_l \le \tau \le T_h} R_x(\tau)$,

where $T_l$ and $T_h$ are pre-specified so that the period will range in the range of speech
and the signal is speech if p is above a given threshold. 
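
By way of a hedged example outside the claim language, the periodicity measure of claim 2 could be computed as the largest autocorrelation value over a range of lags. The sketch below assumes a time-domain, energy-normalized autocorrelation; the function name `periodicity`, the 8 kHz sampling rate, and the 0.5 threshold used in the usage note are all hypothetical and do not come from the claims.

```python
def periodicity(x, t_low, t_high):
    """Largest normalized autocorrelation of x over lags t_low..t_high."""
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return 0.0  # a silent frame has no measurable periodicity
    best = -1.0
    for tau in range(t_low, t_high + 1):
        # Autocorrelation at lag tau, normalized by the frame energy
        # (normalization is an assumption made for this sketch).
        r = sum(x[k] * x[k + tau] for k in range(len(x) - tau)) / energy
        best = max(best, r)
    return best

def lag_bounds(sample_rate_hz, f_low_hz=75.0, f_high_hz=400.0):
    """Convert the 75 Hz to 400 Hz range of claim 3 into lag bounds."""
    return int(sample_rate_hz / f_high_hz), int(sample_rate_hz / f_low_hz)
```

At an assumed 8 kHz sampling rate, `lag_bounds(8000)` gives lags 20 through 106; a frame would then be labelled speech when `periodicity(frame, 20, 106)` exceeds the chosen threshold.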

3. (Original) The method of Claim 2 wherein said period is between about 75 Hz and 
400 Hz. 

4. (Previously Presented) The method of claim 2 where said threshold value is set to 
maximize speech detection accuracy. 

5. (Original) The method of claim 1 wherein said extracting step includes the steps of: 

converting the spectrum of the incoming signal into logarithmic domain, 
removing high frequency components in logarithmic domain by recurrent filtering
along the time axis,
establishing an estimate of noise background, 
converting the estimate into linear domain, and 
suppressing the noise background from the signal, in linear domain. 
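
As a non-limiting illustration outside the claim language, the extracting steps of claim 5 might look like the sketch below. It assumes a first-order recursive smoother per frequency bin as the "recurrent filtering", takes the smoothed log spectrum directly as the noise estimate, and uses power subtraction with a spectral floor; the function name `suppress_noise` and the parameters `alpha` and `floor` are hypothetical.

```python
import math

def suppress_noise(frames, alpha=0.9, floor=1e-3):
    """frames: list of power spectra (lists of positive floats), one per frame."""
    smoothed_log = None
    cleaned = []
    for frame in frames:
        # Convert the spectrum into the logarithmic (dB) domain.
        log_frame = [10.0 * math.log10(max(v, 1e-12)) for v in frame]

        # Recursive low-pass filtering along the time axis removes fast
        # fluctuations, leaving a slowly varying background (assumed form).
        if smoothed_log is None:
            smoothed_log = log_frame
        else:
            smoothed_log = [alpha * s + (1.0 - alpha) * l
                            for s, l in zip(smoothed_log, log_frame)]

        # Convert the noise estimate back into the linear domain.
        noise = [10.0 ** (s / 10.0) for s in smoothed_log]

        # Suppress the background in the linear domain, keeping a small
        # floor so no bin becomes zero or negative (floor is an assumption).
        cleaned.append([max(v - n, floor * v) for v, n in zip(frame, noise)])
    return cleaned
```

With a stationary background, bins containing only noise fall to the floor, while a bin that suddenly rises well above the background keeps most of its energy.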

6. - 10. (Canceled)

11. (Currently Amended) [[The method of claim 10]] A noise-resistant utterance detector
comprising the steps of: 

accepting a speech utterance input signal, 

removing background noise from the utterance signal according to a spectral
subtraction method to get a noise subtracted signal,

inverse filtering the noise subtracted signal with a spectral inverse filter to get an
inverse filtered signal,

calculating the autocorrelation from the inverse filtered signal to get an
autocorrelation result, and

detecting that a frame of the signal being processed is or is not speech based on
a threshold applied to the autocorrelation result,
wherein said spectral inverse filter is determined by the steps of: 

in the logarithmic (dB) domain, removing the mean spectral magnitude
from the original speech spectrum,

in the mean removed short term frequency spectrum S(i), (i=1...128),
determining all the frequency position (Pj), whose magnitudes are
maxima over a window centered around Pj and stretching N
positions to the left and right of Pj,

in the list of peaks, adding the first (i=1) and last (i=128) frequency
positions, their associated magnitudes set equal to the mean of the
first and last M x N magnitudes, respectively, wherein said M and N
are preset constants,

removing the mean of the peak magnitudes from each peak magnitude,
and

if the largest resulting peak magnitude exceeds a predetermined
maximum peak value MAX_dB_DN, normalizing all peaks so that
the largest peaks magnitude becomes MAX_dB_DN,

wherein the resulting inverse filter H(i), (i=1...128) is defined as the
maximum of the normalized peaks and 0 dB.

12. (Currently Amended) The [[method]] noise-resistant utterance detector of claim 11
wherein said M, N and MAX_dB_DN are pre-selected to have the following values: 
M=5, N=3 and MAX_dB_DN=3.5 dB. 

13. (Previously Presented) The method of claim 1 wherein said M, N and MAX_dB_DN
are pre-selected to have the following values: M=5, N=3 and MAX_dB_DN=3.5 dB.

14. (Currently Amended) The [[method]] noise-resistant utterance detector of claim 11
further comprising [[removing said inverse filter from the original spectrum in the
logarithmic domain U(i) = S(i) - H(i)]] locating close low-frequency formants in the noise
subtracted signal if they exist and inserting spectral valleys between said formants
before said inverse filtering.

15. (Currently Amended) A method of determining if a signal includes speech, 
comprising: 

accepting an input signal; 

removing background noise from said input signal according to a spectral 

subtraction method to obtain a noise subtracted signal; 
inverse filtering said noise subtracted signal with a spectral inverse filter to
obtain an inverse filtered signal[[, wherein said inverse filtering is performed
... log frequency domain and is implemented by subtracting an estimated
inverse filtering spectrum from an original spectrum of said input signal]];

calculating the autocorrelation from said inverse filtered signal to get an
autocorrelation result; and

detecting that a frame of said input signal is or is not speech based on a
threshold applied to said autocorrelation result,

wherein said spectral inverse filter is determined by the steps of:

in the logarithmic (dB) domain, removing the mean spectral magnitude
from the original speech spectrum,

in the mean removed short term frequency spectrum S(i), (i=1...128),
determining all the frequency position (Pj), whose magnitudes are
maxima over a window centered around Pj and stretching N
positions to the left and right of Pj,
in the list of peaks, adding the first (i=1) and last (i=128) frequency
positions, their associated magnitudes set equal to the mean of the
first and last M x N magnitudes, respectively, wherein said M and N
are preset constants,

removing the mean of the peak magnitudes from each peak magnitude,
and

if the largest resulting peak magnitude exceeds a predetermined
maximum peak value MAX_dB_DN, normalizing all peaks so that
the largest peaks magnitude becomes MAX_dB_DN,

wherein the resulting inverse filter H(i), (i=1...128) is defined as the maximum of
the normalized peaks and 0 dB.

16. (New) The method of claim 15 wherein said M, N and MAX_dB_DN are pre-selected
to have the following values: M=5, N=3 and MAX_dB_DN=3.5 dB.

17. (New) The method of claim 15 further comprising locating close low-frequency 
formants in the noise subtracted signal if they exist and inserting spectral valleys 
between said formants before said inverse filtering. 